Skip to content

docs: dogfood report for v3.10.1-dev.80#1180

Merged
carlos-alm merged 9 commits into
mainfrom
docs/dogfood-report-v3.10.1
May 22, 2026
Merged

docs: dogfood report for v3.10.1-dev.80#1180
carlos-alm merged 9 commits into
mainfrom
docs/dogfood-report-v3.10.1

Conversation

@carlos-alm

@carlos-alm carlos-alm commented May 20, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds the dogfooding report for @optave/codegraph@3.10.1-dev.80 (run on 2026-05-20).

Test plan

@claude

claude Bot commented May 20, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


PR Review: Dogfood Report v3.10.1-dev.80

Review Progress

  • Read and analyze the dogfood report file
  • Evaluate against 5-phase review criteria
  • Assess BACKLOG.md compliance
  • Provide final rating and recommendation

Root Cause Analysis

Problem Being Solved: This PR adds comprehensive dogfooding documentation for codegraph v3.10.1-dev.80, providing validation that 12 newly-ported native extractors achieve byte-identical parity with WASM engines and documenting 4 critical bugs discovered during testing.

Why This Approach: Documentation-only PR that follows the established pattern in generated/dogfood/ and correctly separates docs from bug fixes (which are handled in separate PRs per one-PR-one-concern principle).

Risk Assessment: Zero risk - pure documentation with no code changes.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies, docs-only change
  • Foundation-aligned: ✓ — Aligns with P1 (always-current) by validating graph accuracy and P8 (development-focused) by improving tooling quality through dogfooding
  • Problem-fit: 4 — High value for maintaining tool reliability; prevents AI agents from working with incorrect dependency graphs
  • Breaking: No — Documentation-only, no API or behavior changes
  • Tier: 1

Critical Concerns

None identified. This is exemplary documentation:

Strengths

  1. Comprehensive validation coverage: Tests all 464 commands/scenarios across 12 categories
  2. Rigorous engine comparison: Documents native vs WASM parity with specific metrics
  3. Actionable bug discovery: Found and filed 4 real issues (bug: incremental rebuild silently drops 32 import edges (native) / 37 (WASM) #1174-bug: build command rejects --db flag, breaking workflow with non-default DB locations #1177) with root cause analysis
  4. Performance benchmarks: Documents 5.4× build speed improvement and other metrics
  5. Quality assessment: Honest 7.5/10 rating with specific strengths/weaknesses

Quality Indicators

  • Follows established patterns: Matches structure of previous dogfood reports in /generated/dogfood/
  • Separation of concerns: Correctly cherry-picked docs from fix branch per CLAUDE.md guidance
  • Testing depth: Engine parity validation for all 11 newly-ported extractors shows 10/11 byte-identical
  • Risk transparency: Documents incremental rebuild edge-loss bug as high priority

Content Validation

The report demonstrates the exact kind of quality validation codegraph needs:

  • Native extractors achieve near-perfect WASM parity (critical for user trust)
  • Performance improvements are real and substantial (5.4× build speed, 52× complexity)
  • Edge cases and error conditions are properly tested
  • Infrastructure (MCP, programmatic API) is validated end-to-end

Value to Project

This documentation:

  1. Proves release quality: Validates that 12 new native extractors work correctly
  2. Prevents regressions: Establishes baseline for future releases
  3. Guides users: Shows comprehensive tool usage patterns
  4. Tracks progress: Documents performance improvements and feature completeness

Final Recommendation

  • Rating: ⭐⭐⭐⭐⭐ (5/5)
  • Action: APPROVE
  • Reasoning: Exceptional quality documentation that validates critical functionality, discovers real bugs, and provides comprehensive release validation. This is exactly the kind of thorough testing and documentation that maintains tool reliability and user trust. Zero risk with high value for project quality assurance.

@greptile-apps

greptile-apps Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR updates the dogfooding report for @optave/codegraph@3.10.1-dev.80 to address two previously flagged review issues: correcting the native extractor port count from 12 to 11 in §6 and §12, and registering the post-session follow-up issue (#1181) in the §13 tracking table alongside a [^1] footnote tying it to the jina-base benchmark backfill in §8.

  • Corrects extractor count from 12 to 11 in two prose locations (§6 and §12), consistent with the 11-row §5 table, the 11 PRs in #1097–#1107, and the existing §5 conclusion.
  • Adds Issue[^1] #1181 to the §13 issues-and-PRs table, explaining it was filed post-session to track the deferred jina-base benchmark; the †-footnoted benchmark data in §8 already shows the completed backfill numbers.

Confidence Score: 5/5

Safe to merge — docs-only change with no code modifications.

The change touches a single markdown report file, correcting two prose count references (12→11) that now agree with every other count in the document, and appending one tracking row to the issues table. No logic, schema, or executable code is affected.

No files require special attention.

Important Files Changed

Filename Overview
generated/dogfood/DOGFOOD_REPORT_v3.10.1-dev.80.md Docs-only update: corrects extractor count from 12→11 in §6 and §12, and adds post-session follow-up issue #1181 to the §13 tracking table with explanatory footnote.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Dogfood Session\n2026-05-20] --> B[§5 Engine Comparison\n11 newly-ported extractors]
    A --> C[§8 Embedding Benchmark\nminilm / jina-small / jina-base†]
    B --> D[§6 Release-Specific Tests\n11 extractor ports #1097–#1107]
    B --> E[§12 Overall Assessment\n11 new native extractors]
    C --> F[jina-base run deferred]
    F --> G[Issue #1181 filed\npost-session]
    G --> H[Benchmark backfilled\ninto §8 with ± tolerance note]
    H --> I[§13 Issues & PRs table\n#1181 added as Issue^1]
    D -.->|count fix: 12→11| D2[✓ Corrected in this PR]
    E -.->|count fix: 12→11| E2[✓ Corrected in this PR]
    I -.->|row added| I2[✓ Added in this PR]
Loading

Reviews (11): Last reviewed commit: "fix: resolve merge conflicts with main" | Re-trigger Greptile


## 6. Release-Specific Tests (changes since v3.10.0)

The v3.10.1-dev.80 series adds 12 native extractor ports (#1097–#1107), several language-specific fixes (#1109, #1122, #1123, #1124, #1127, #1128, #1158), a MCP enhancement (#1149), benchmark-stability fixes (#1119, #1120, #1131, #1133, #1134), and CI hardening (#1146, #1151, #1164).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Extractor count mismatch between PR description and report body

The PR description states "12 newly-ported native extractors validated," but the report consistently counts 11 throughout — the §5 table lists 11 languages, the §5 conclusion reads "all 11 newly-ported extractors," and the §6 changelog entry enumerates 11 ports (#1097–#1107). One of these two numbers is wrong and will cause confusion when the report is referenced later. If the correct count is 11, the PR description should be updated to match; if it's 12, the missing language should be added to the §5 table and the §6 test row.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ca4146c. The body is the source of truth: 11 PRs in #1097-#1107, 11 rows in the §5 table, and the §5 conclusion already reads "all 11 newly-ported extractors." The §6 paragraph now says 11, and the PR description has been updated to 11 to match.


- No regressions vs the v3.10.0 baseline in `generated/benchmarks/BUILD-BENCHMARKS.md`. The corpus shrank (745 → 612 files) due to PR #1134's fixture exclusion, but per-file metrics improved on every engine.
- Native fast-skip preflight (#1054) is firing as expected: 16 ms no-op rebuild matches WASM's, validating the `detectNoChanges` short-circuit.
- The 1-file rebuild gap (WASM 45ms vs Native 67ms) is the inverse of full-build performance — WASM's lighter orchestrator setup wins on tiny incremental work.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 jina-base embedding benchmark published as incomplete

The embedding benchmark table has jina-base (768d) with the value _benchmark still running at report cut_. Publishing a report with a known-pending data point makes the §8 table misleading — reviewers cannot assess whether jina-base regressed, improved, or is even viable for the release. Either wait for the run to finish and fill in the numbers, or explicitly mark this row as "not completed, see follow-up" and omit the empty columns so it's clear no data was collected rather than data being redacted.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ca4146c. Replaced the "benchmark still running at report cut" placeholder with an explicit "not completed in this session" marker linked to follow-up issue #1181, and zeroed out the empty Hit@k columns with em-dashes so it's clear no data was collected (rather than redacted). The follow-up issue tracks finishing the jina-base run and backfilling the numbers.

Fix two Greptile P1/P2 findings in the v3.10.1-dev.80 dogfood report:

- §6 changelog said "12 native extractor ports (#1097-#1107)" but the PR
  range and the §5 parity table both enumerate 11. Correct to 11 so the
  body is internally consistent (the §5 table and conclusion already read
  11/11).
- §8 embedding-benchmark table left the jina-base (768d) row as
  "benchmark still running at report cut". Replace the placeholder with
  an explicit "not completed in this session" note and link to the
  follow-up issue #1181 so the missing data is clearly tracked rather
  than reading as redacted.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

Fixed in 6d9e1bb. The §12 Overall Assessment opening sentence at line 435 now reads "11 new native extractors" — matching the §5 conclusion, §6 intro, §11 testing plan, and the §12 bullet list. The report is now fully self-consistent.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

FYI: opened #1186 to backfill the actual jina-base numbers for §8, which replaces the same placeholder line this PR clarifies. If #1186 merges first, the §8 edit here becomes a no-op — the line that says _benchmark still running at report cut_ no longer exists. The §6 extractor-count fix (12 → 11) on line 201 is independent and still wanted.

No action needed from this PR yet; just flagging the overlap so whoever lands first knows the other branch will need a rebase / trivial conflict resolution.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed the §13 follow-up note from the latest Greptile review:

Commit: 42ed9a4

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 1a252ec into main May 22, 2026
21 checks passed
@carlos-alm carlos-alm deleted the docs/dogfood-report-v3.10.1 branch May 22, 2026 06:25
@github-actions github-actions Bot locked and limited conversation to collaborators May 22, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant